10 research outputs found

    Lower Bounds for Two-Sample Structural Change Detection in Ising and Gaussian Models

    Full text link
    The change detection problem is to determine if the Markov network structures of two Markov random fields differ from one another given two sets of samples drawn from the respective underlying distributions. We study the trade-off between the sample sizes and the reliability of change detection, measured as a minimax risk, for the important cases of the Ising models and the Gaussian Markov random fields restricted to models whose network structures have $p$ nodes and degree at most $d$, and obtain information-theoretic lower bounds for reliable change detection over these models. We show that for the Ising model, $\Omega\left(\frac{d^2}{(\log d)^2}\log p\right)$ samples are required from each dataset to detect even the sparsest possible changes, and that for the Gaussian, $\Omega\left(\gamma^{-2} \log(p)\right)$ samples are required from each dataset to detect change, where $\gamma$ is the smallest ratio of off-diagonal to diagonal terms in the precision matrices of the distributions. These bounds are compared to the corresponding results in structure learning, and closely match them under mild conditions on the model parameters. Thus, our change detection bounds inherit partial tightness from the structure learning schemes in previous literature, demonstrating that in certain parameter regimes, the naive structure learning based approach to change detection is minimax optimal up to constant factors.
    Comment: Presented at the 55th Annual Allerton Conference on Communication, Control, and Computing, Oct. 201

    Two studies in resource-efficient inference: structural testing of networks, and selective classification

    Get PDF
    Inference systems suffer costs arising from information acquisition, and from the communication and computational costs of executing complex models. This dissertation proposes, in two distinct themes, systems-level methods to reduce these costs without affecting the accuracy of inference, by using ancillary low-cost methods to cheaply address most queries while only using resource-heavy methods on 'difficult' instances. The first theme concerns testing methods in structural inference of networks and graphical models, the proposal being that one first cheaply tests whether the structure underlying a dataset differs from a reference structure, and only estimates the new structure if this difference is large. This study focuses on theoretically establishing separations between the costs of testing and learning, to determine when a strategy such as the above has benefits. For two canonical models---the Ising model, and the stochastic block model---fundamental limits are derived on the costs of one- and two-sample goodness-of-fit tests by determining information-theoretic lower bounds, and developing matching tests. A biphasic behaviour in the costs of testing is demonstrated: there is a critical size scale such that detection of differences smaller than this size is nearly as expensive as recovering the structure, while detection of larger differences has vanishing costs relative to recovery.

    The second theme concerns using selective classification (SC), or classification with an option to abstain, to control inference-time costs in the machine learning framework. The proposal is to learn a low-complexity selective classifier that only abstains on hard instances, and to execute more expensive methods upon abstention. Herein, a novel SC formulation with a focus on high accuracy is developed, and used to obtain both theoretical characterisations and a scheme for learning selective classifiers based on optimising a collection of class-wise decoupled one-sided risks. This scheme attains strong empirical performance, and admits efficient implementation, leading to an effective SC methodology. Finally, SC is studied in the online learning setting with feedback only provided upon abstention, modelling the practical lack of reliable labels without expensive feature collection, and a Pareto-optimal low-error scheme is described.
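
    The cascade idea in the second theme can be illustrated with a minimal sketch, assuming scikit-learn-style models exposing `predict_proba`/`predict` and a simple confidence threshold in place of the one-sided-risk training described above; the class and parameter names are hypothetical.

```python
import numpy as np

class SelectiveCascade:
    """Two-stage pipeline sketch: a cheap selective classifier answers the
    queries it is confident about and defers the rest to an expensive model."""

    def __init__(self, cheap_model, expensive_model, threshold=0.9):
        self.cheap = cheap_model          # low-cost model exposing predict_proba
        self.expensive = expensive_model  # resource-heavy fallback model
        self.threshold = threshold        # abstain below this confidence level

    def predict(self, X):
        probs = self.cheap.predict_proba(X)               # shape (n, n_classes)
        confident = probs.max(axis=1) >= self.threshold   # accept mask
        preds = probs.argmax(axis=1)
        if (~confident).any():                            # defer 'difficult' instances
            preds[~confident] = self.expensive.predict(X[~confident])
        return preds, confident
```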

    Doubly-Optimistic Play for Safe Linear Bandits

    Full text link
    The safe linear bandit problem (SLB) is an online approach to linear programming with unknown objective and unknown round-wise constraints, under stochastic bandit feedback of rewards and safety risks of actions. We study aggressive \emph{doubly-optimistic play} in SLBs, and its role in avoiding the strong assumptions and poor efficacy associated with extant pessimistic-optimistic solutions. We first elucidate an inherent hardness in SLBs due to the lack of knowledge of constraints: there exist `easy' instances, for which suboptimal extreme points have large `gaps', but on which SLB methods must still incur $\Omega(\sqrt{T})$ regret and safety violations due to an inability to refine the location of optimal actions to arbitrary precision. In a positive direction, we propose and analyse a doubly-optimistic confidence-bound based strategy for the safe linear bandit problem, DOSLB, which exploits supreme optimism by using optimistic estimates of both reward and safety risks to select actions. Using a novel dual analysis, we show that despite the lack of knowledge of constraints, DOSLB rarely takes overly risky actions, and obtains tight instance-dependent $O(\log^2 T)$ bounds on both efficacy regret and net safety violations up to any finite precision, thus yielding large efficacy gains at a small safety cost and without strong assumptions. Concretely, we argue that the algorithm activates noisy versions of an `optimal' set of constraints at each round, and that activation of suboptimal sets of constraints is limited by the larger of the safety and efficacy gaps we define.
    Comment: v2: extensive rewrite, with a much cleaner exposition of the theory, and improvements in key definitions
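
    A minimal sketch of the doubly-optimistic selection principle, restricted for simplicity to a finite candidate set of actions and a single linear safety constraint; the function name, the shared confidence radius `beta`, and the ellipsoidal bonus form are illustrative assumptions rather than the paper's exact DOSLB specification.

```python
import numpy as np

def doubly_optimistic_step(actions, V, theta_hat, a_hat, beta, risk_budget):
    """One doubly-optimistic selection step over a finite set of candidate actions.
    Optimism is applied to both the reward (upper bound) and the safety risk
    (lower bound), so plausibly-safe, plausibly-good actions are preferred."""
    V_inv = np.linalg.inv(V)                       # inverse Gram matrix of past actions
    best, best_val = None, -np.inf
    for x in actions:
        width = beta * np.sqrt(x @ V_inv @ x)      # confidence width at action x
        optimistic_reward = theta_hat @ x + width  # upper bound on reward
        optimistic_risk = a_hat @ x - width        # lower bound on safety risk
        if optimistic_risk <= risk_budget and optimistic_reward > best_val:
            best, best_val = x, optimistic_reward
    return best
```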

    Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk

    Full text link
    We investigate a natural but surprisingly unstudied approach to the multi-armed bandit problem under safety risk constraints. Each arm is associated with an unknown law on safety risks and rewards, and the learner's goal is to maximise reward whilst not playing unsafe arms, as determined by a given threshold on the mean risk. We formulate a pseudo-regret for this setting that enforces this safety constraint in a per-round way by softly penalising any violation, regardless of the gain in reward it may yield. This has practical relevance to scenarios such as clinical trials, where safety must be maintained in each round rather than in an aggregated sense. We describe doubly optimistic strategies for this scenario, which maintain optimistic indices for both safety risk and reward. We show that schemes based on both frequentist and Bayesian indices satisfy tight gap-dependent logarithmic regret bounds, and further that these play unsafe arms only logarithmically many times in total. This theoretical analysis is complemented by simulation studies demonstrating the effectiveness of the proposed schemes, and probing the domains in which their use is appropriate.
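
    The doubly optimistic index strategy can be sketched with UCB-style indices, assuming bounded rewards and risks; the specific bonus term and the fallback rule when no arm looks plausibly safe are illustrative choices, not the paper's exact frequentist or Bayesian schemes.

```python
import numpy as np

def doubly_optimistic_arm(reward_sums, risk_sums, counts, t, risk_threshold):
    """Pick an arm using optimistic indices for both reward and safety risk:
    an upper confidence bound on mean reward and a lower confidence bound on
    mean risk. Arms whose risk LCB exceeds the threshold are excluded."""
    counts = np.maximum(counts, 1)                   # avoid division by zero
    bonus = np.sqrt(2 * np.log(t + 1) / counts)      # confidence radius per arm
    reward_ucb = reward_sums / counts + bonus        # optimistic reward index
    risk_lcb = risk_sums / counts - bonus            # optimistic (low) risk index
    plausibly_safe = risk_lcb <= risk_threshold
    if not plausibly_safe.any():
        return int(np.argmin(risk_lcb))              # fall back to the safest-looking arm
    reward_ucb[~plausibly_safe] = -np.inf            # rule out plausibly unsafe arms
    return int(np.argmax(reward_ucb))
```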

    Limits on testing structural changes in Ising models

    Full text link
    We present novel information-theoretic limits on detecting sparse changes in Ising models, a problem that arises in many applications where network changes can occur due to some external stimuli. We show that the sample complexity for detecting sparse changes, in a minimax sense, is no better than learning the entire model, even in settings with local sparsity. This is a surprising fact in light of prior work rooted in sparse recovery methods, which suggests that sample complexity in this context scales only with the number of network changes. To shed light on when change detection is easier than structure learning, we consider testing of edge deletion in forest-structured graphs, and high-temperature ferromagnets, as case studies. We show for these that testing of small changes is similarly hard, but that testing of \emph{large} changes is well separated from structure learning. These results imply that testing of graphical models may not be amenable to concepts such as restricted strong convexity leveraged for sparsity pattern recovery, and that algorithm development should instead be directed towards detection of large changes.
    https://arxiv.org/abs/2011.0367

    Piecewise linear regression via a difference of convex functions

    Full text link
    We present a new piecewise linear regression methodology that utilizes fitting a difference of convex functions (DC functions) to the data. These are functions $f$ that may be represented as the difference $\phi_1 - \phi_2$ for a choice of convex functions $\phi_1, \phi_2$. The method proceeds by estimating piecewise-linear convex functions, in a manner similar to max-affine regression, whose difference approximates the data. The choice of the function is regularised by a new seminorm over the class of DC functions that controls the $\ell_\infty$ Lipschitz constant of the estimate. The resulting methodology can be efficiently implemented via quadratic programming even in high dimensions, and is shown to have close to minimax statistical risk. We empirically validate the method, showing it to be practically implementable, and to have comparable performance to existing regression/classification methods on real-world datasets.
    http://proceedings.mlr.press/v119/siahkamari20a/siahkamari20a.pdf
    Published version
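
    A minimal sketch of the difference-of-convex (max-affine) representation underlying the method, showing only how a fitted DC predictor is evaluated; the parameter layout is an assumption, and the quadratic-programming fit with the Lipschitz-controlling seminorm described above is not reproduced here.

```python
import numpy as np

def dc_predict(X, A, b, C, d):
    """Evaluate a difference-of-max-affine (DC) predictor
        f(x) = max_k (A[k] @ x + b[k]) - max_j (C[j] @ x + d[j]),
    i.e. a difference of two piecewise-linear convex functions, at the rows
    of X. The parameters (A, b, C, d) stand in for a (hypothetical) fit."""
    convex_part = (X @ A.T + b).max(axis=1)    # max-affine convex component
    concave_part = (X @ C.T + d).max(axis=1)   # subtracted convex component
    return convex_part - concave_part

# Example: the 1-D "tent" function f(x) = 1 - |x|, written as the DC
# function 0 - max(-x - 1, x - 1).
X = np.linspace(-2, 2, 5).reshape(-1, 1)
A, b = np.zeros((1, 1)), np.zeros(1)                 # convex part identically 0
C, d = np.array([[-1.0], [1.0]]), np.array([-1.0, -1.0])
print(dc_predict(X, A, b, C, d))                     # [-1.  0.  1.  0. -1.]
```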